gh-140036: _pydecimal: avoid slow exponentiation in floor division #140044

picnixz · 2025-10-13T13:55:24Z

For now, we'll only fix this code path; the other case that @mdickinson reported (#140036 (comment)) needs a bit more discussion.

Issue: Python implementation of Decimal can hang in floor division #140036

picnixz · 2025-10-13T13:57:57Z

Note: I didn't add tests yet because I was too lazy. I'll do it once we agree on the code.

skirpichev · 2025-10-13T14:26:43Z

Lib/_pydecimal.py

+            if q.bit_length() < 1 + context.prec * _LOG_10_BASE_2:
+                # ensure that the previous check was sufficient
+                if len(str_q := str(q)) <= context.prec:


I think both variants are fine, but you should choose one. You shouldn't worry about the string conversion limit.

picnixz · 2025-10-13

It was to prevent corner cases. I don't have the time to check, but I feel that it's possible to have some q such that q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 is true but len(str(q)) <= context.prec is false. But maybe this is impossible.

I also didn't want to compute str(q) first, as it could be an expensive check. If q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 already fails, there is no need to compute str(q), which could also raise an exception.

efimov-mikhail · 2025-10-13

> It was to prevent corner cases. I don't have the time to check, but I feel that it's possible to have some q such that q.bit_length() < 1 + context.prec * _LOG_10_BASE_2 is true but len(str(q)) <= context.prec is false. But maybe this is impossible.

It's impossible if _LOG_10_BASE_2 is less than or equal to the exact mathematical value of log_2(10) and we change the condition to q.bit_length() < context.prec * _LOG_10_BASE_2.

Let's prove it by contradiction. Suppose that
a) q.bit_length() < context.prec * _LOG_10_BASE_2 is true
BUT
b) len(str(q)) <= context.prec is false.

Then we have
a) q.bit_length() < context.prec * _LOG_10_BASE_2
b) len(str(q)) > context.prec.
Since len(str(q)) and context.prec are integers, we have len(str(q)) >= context.prec + 1,
and therefore q >= 10 ** context.prec.
Applying the mathematical log_2 gives
log_2(q) >= context.prec * log_2(10), where log_2(10) is the exact value.

Since log_2(10) >= _LOG_10_BASE_2, we have log_2(q) >= context.prec * log_2(10) >= context.prec * _LOG_10_BASE_2 > q.bit_length().
As a result, q > 2 ** q.bit_length(),
which is impossible, because q < 2 ** q.bit_length() holds for every non-negative integer q.

It seems that checking q.bit_length() < context.prec * _LOG_10_BASE_2 would be good enough.
If it isn't, constructing the str looks like the best solution.

And some small example:

>>> import math
>>> q = 110
>>> prec = 2
>>> q.bit_length() < 1 + prec * math.log2(10)
True
>>> len(str(q)) <= prec
False
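Both claims can also be checked by brute force over small values. This is only a rough sketch: the constant is the one quoted in the thread, while the function names `passes_strict` and `passes_with_plus_one` are illustrative and do not come from the PR.

```python
_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')  # just below log2(10)

def passes_strict(q, prec):
    """Corrected check: q.bit_length() < prec * _LOG_10_BASE_2."""
    return q.bit_length() < prec * _LOG_10_BASE_2

def passes_with_plus_one(q, prec):
    """Check from the original diff, with the extra `1 +`."""
    return q.bit_length() < 1 + prec * _LOG_10_BASE_2

strict_failures = []
plus_one_failures = []
for prec in range(1, 6):
    for q in range(10 ** (prec + 1)):
        too_long = len(str(q)) > prec
        # A failure is a q that passes the cheap check yet has too many digits.
        if passes_strict(q, prec) and too_long:
            strict_failures.append((q, prec))
        if passes_with_plus_one(q, prec) and too_long:
            plus_one_failures.append((q, prec))
```

For these small ranges `strict_failures` stays empty, while `plus_one_failures` contains pairs such as `(110, 2)` from the example above. A search like this is a sanity check only; the general statement rests on the proof.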

picnixz · 2025-10-13

Thanks for the analysis.

> It's impossible if _LOG_10_BASE_2 is less than or equal to the exact mathematical value of log_2(10)

This appears to be the case as pow(2, float.fromhex('0x1.a934f0979a371p+1')) is something like 9.99...98.
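One way to double-check this without relying on floating-point `pow` is to compare the constant against a high-precision value of log_2(10). The sketch below assumes that 60 significant digits in the `decimal` module are ample for the comparison, since the doubles involved differ from log_2(10) by roughly 1e-16. It also checks the neighbouring double one ulp above, which comes up later in the thread. The variable names are illustrative.

```python
import math
from decimal import Decimal, localcontext

LO = float.fromhex('0x1.a934f0979a371p+1')
HI = float.fromhex('0x1.a934f0979a372p+1')

with localcontext() as ctx:
    ctx.prec = 60
    # log2(10) = ln(10) / ln(2), computed to about 60 significant digits.
    log2_10 = Decimal(10).ln() / Decimal(2).ln()
    # Decimal(float) converts the double exactly, so the comparisons are exact.
    lo_is_lower_bound = Decimal(LO) < log2_10
    hi_is_upper_bound = log2_10 < Decimal(HI)

# HI should be the very next representable double after LO.
adjacent = math.nextafter(LO, math.inf) == HI
```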

> And some small example:

OK, so the 1+ was indeed too much (that's what I feared). This will also help in changing the q > 10 ** prec case.

skirpichev · 2025-10-14

Michail, thanks for correcting the first inequality, q.bit_length() < context.prec * _LOG_10_BASE_2 (1).

Unfortunately, I don't think this inequality can be used as an "optimization". Remember, it should be equivalent to q < 10**context.prec (2), i.e. both inequalities should have the same boolean value. In particular, if (1) is false, then (2) should be false too, because in that case the second check is never reached.

This equivalence is trivial for len(str(q)) <= context.prec (3) and (2), where q >= 0 and context.prec > 0 are integers. I think we should use this; there is no possible shortcut.

efimov-mikhail · 2025-10-14

It's impossible to build an equivalent inequality from q.bit_length() alone. But we can significantly reduce how often len(str(q)) needs to be computed. We have already proved that if q.bit_length() < context.prec * _LOG_10_BASE_2 then len(str(q)) <= context.prec, so there is no need to check. On the other hand, if q.bit_length() >= 1 + context.prec * _LOG_10_BASE_2_G, then q >= 2**(q.bit_length()-1) >= 2**(context.prec * _LOG_10_BASE_2_G) >= 2**(context.prec * log_2(10)) = 10**context.prec, where _LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1') is slightly greater than the exact value of log_2(10).

It means we could implement checking this way:

_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')
_LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1')

def q_is_greater_or_equal_than_pow_10_a(q: int, a: int) -> bool:
    if q.bit_length() < a * _LOG_10_BASE_2:
        return False
    elif q.bit_length() >= 1 + a * _LOG_10_BASE_2_G:
        return True
    else:
        return len(str(q)) > a
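To see which branch ends up deciding, here is a rough sketch that mirrors the three-way check and compares it against the exact `q >= 10**a` for values near powers of ten and powers of two. The function `compare_with_pow10`, the path labels and the chosen candidates are illustrative assumptions, not code from the PR, and only small exponents (a < 40) are covered.

```python
_LOG_10_BASE_2 = float.fromhex('0x1.a934f0979a371p+1')    # below log2(10)
_LOG_10_BASE_2_G = float.fromhex('0x1.a934f0979a372p+1')  # above log2(10)

def compare_with_pow10(q, a):
    """Return (q >= 10**a, path), where path names the deciding branch."""
    n = q.bit_length()
    if n < a * _LOG_10_BASE_2:
        return False, "bits"   # q < 2**n < 10**a
    if n >= 1 + a * _LOG_10_BASE_2_G:
        return True, "bits"    # q >= 2**(n - 1) >= 10**a
    return len(str(q)) > a, "str"

paths = {"bits": 0, "str": 0}
mismatches = []
for a in range(1, 40):
    p = 10 ** a
    candidates = {
        p - 1, p, p + 1, p // 2, 2 * p,
        1 << p.bit_length(),              # smallest power of two above p
        (1 << (p.bit_length() - 1)) - 1,  # one bit shorter than p
    }
    for q in candidates:
        answer, path = compare_with_pow10(q, a)
        paths[path] += 1
        if answer != (q >= p):
            mismatches.append((q, a))
```

Only values whose bit length matches that of `10**a` reach the `str` branch; every other candidate is settled by `bit_length()` alone.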

skirpichev · 2025-10-14

This makes more sense to me. Though, IMO, such a helper isn't worth it if it's needed in only one or two places in the code.

picnixz · 2025-10-14

Yes, I forgot to ask yesterday about checking the reverse condition (I picked up pen and paper but then had to leave). I'll go for a helper, since it's used more than once, purely for clarity.

skirpichev · 2025-10-14

> it's used more than once

I guess you are planning for another instance in #140036 (comment). Two cases is not that many...

On the other hand, we have a lot of len(str) computations in the module:

$ git grep 'len(str' Lib/_pydecimal.py | wc -l
40

IMO, keeping the code simple is better. Maybe we should instead rely on the wishful thinking that int->str conversion will eventually be fast.

picnixz · 2025-10-13T16:05:25Z

I need to go for today so I'm putting it on hold (I'll change the other places).

picnixz · 2025-10-14T10:24:55Z

So I've pushed an inlined version, because in one case we need to compute str_q anyway (so I prefer forwarding the exception normally). For the other case, where str_q doesn't need to be computed, I fall back to a slow path ($10^{prec}$) because otherwise it would be a behavioral change.

I need to write good tests, but I first need to find good values for them. I'm unavailable for the rest of the week, so it will have to wait.

picnixz · 2025-10-26T12:47:38Z

@tim-one I implemented your suggestion for the digits computation when not relying on str(q), but I kept our original implementation where we need to compute len(str(q)) in the end anyway.

I agree that a * _LOG_10_BASE_2_LO ignores the low-order bits if a needs more than 53 bits. However, how would you suggest that we fix this? Note that it's fine to have an imprecise value of _LOG_10_BASE_2_LO as long as it is smaller than the real value of log2(10).
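The precision concern can be reproduced in a few lines. The second half of the sketch is a purely hypothetical alternative that is neither taken from the PR nor from the reviewers: it replaces the float constant with a rational lower bound of log2(10), namely 3321928094887362 / 10**15, so that the comparison uses integers only. The name `surely_less_than_pow10` is invented for illustration.

```python
_LOG_10_BASE_2_LO = float.fromhex('0x1.a934f0979a371p+1')

# An int with more than 53 significant bits is rounded when it is converted
# to float, which happens implicitly in `a * _LOG_10_BASE_2_LO`.
a = 2 ** 53 + 1
low_bit_lost = float(a) == float(a - 1)
same_product = a * _LOG_10_BASE_2_LO == (a - 1) * _LOG_10_BASE_2_LO

# Hypothetical integer-only variant (an assumption, not the PR's code):
# 3321928094887362 / 10**15 is smaller than log2(10) = 3.32192809488736234...
_NUM, _DEN = 3321928094887362, 10 ** 15

def surely_less_than_pow10(q, a):
    """True only if q < 10**a is guaranteed; False means "not decided"."""
    # bit_length < a * NUM / DEN < a * log2(10) implies q < 2**bit_length < 10**a.
    return q.bit_length() * _DEN < a * _NUM
```

With integer arithmetic the size of `a` no longer affects exactness, at the cost of two big-integer multiplications per check.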

Co-authored-by: Mikhail Efimov <[email protected]>

_pydecimal: avoid slow exponentiation in floor division

ee617bd

picnixz requested review from serhiy-storchaka and skirpichev October 13, 2025 13:55

bedevere-app bot mentioned this pull request Oct 13, 2025

Python implementation of Decimal can hang in floor division #140036

Open

bedevere-app bot added the awaiting core review label Oct 13, 2025

skirpichev reviewed Oct 13, 2025

picnixz marked this pull request as draft October 13, 2025 16:05

bedevere-app bot removed the awaiting core review label Oct 13, 2025

refine q < 10**context.prec checks

329c31a

picnixz force-pushed the fix/decimal/infinite-prec-140036 branch from 4951511 to 329c31a October 14, 2025 10:22

picnixz added 3 commits October 26, 2025 13:38

_pydecimal: add helpers for computing len(str(q)) < a

103fe1e

_pydecimal: use helpers for computing len(str(q)) < a

efbdc0a

_pydecimal: add tests for unbounded contexts

9377ab6

picnixz force-pushed the fix/decimal/infinite-prec-140036 branch 2 times, most recently from ecefabe to 9377ab6 October 26, 2025 12:43

picnixz marked this pull request as ready for review November 7, 2025 09:31

bedevere-app bot added the awaiting core review label Nov 7, 2025

picnixz commented Nov 7, 2025

Apply suggestions from code review

23494fb

efimov-mikhail reviewed Nov 9, 2025

picnixz added 2 commits November 9, 2025 14:42

address review

6b55711

reformulate comment

49d5b0d

efimov-mikhail reviewed Nov 9, 2025

efimov-mikhail approved these changes Nov 9, 2025

picnixz and others added 2 commits November 9, 2025 18:14

Update Lib/_pydecimal.py

ea3e499

Co-authored-by: Mikhail Efimov <[email protected]>

add test for _is_less_than_pow10a_use_str() slow path

b1c24b7

picnixz requested a review from tim-one November 9, 2025 17:31
